For N=1:

rank | frequency | n-gram |
---|---|---|
1 | 14336 | -ی |
2 | 7128 | -ن |
3 | 7042 | -، |
4 | 4743 | -ه |
5 | 4645 | -ا |

For N=2:

rank | frequency | n-gram |
---|---|---|
1 | 3591 | -ای |
2 | 3442 | -ان |
3 | 1976 | -ها |
4 | 1451 | -ین |
5 | 1411 | -ری |

For N=3:

rank | frequency | n-gram |
---|---|---|
1 | 2473 | -های |
2 | 1262 | -ها |
3 | 970 | -ایی |
4 | 722 | -انی |
5 | 563 | -ای |

For N=4:

rank | frequency | n-gram |
---|---|---|
1 | 1553 | -های |
2 | 515 | -های |
3 | 468 | -هایی |
4 | 330 | -هها |
5 | 241 | -ترین |

For N=5:

rank | frequency | n-gram |
---|---|---|
1 | 462 | -ههای |
2 | 299 | -هایی |
3 | 240 | -یهای |
4 | 200 | -انوار |
5 | 174 | -نهای |

The tables show the most frequent letter N-grams at the ends of words for N = 1…5. The procedure parallels that of 2.2.5 Most frequent word beginnings, except that the aim here is suffix detection rather than prefix detection.
Query for N=3:
SELECT @pos:=(@pos+1), xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT("-", RIGHT(word,3)) FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
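
The same counting can be sketched outside the database. The Python snippet below is an illustrative approximation of the query, not the code used to produce the tables: the function name `top_endings` and the sample word list are hypothetical, each list entry is counted once (as the query counts rows of `words`), and ties are broken alphabetically, which the SQL does not guarantee. Like `RIGHT(word, 3)`, the slice `word[-n:]` returns the whole word when it is shorter than N characters.

```python
from collections import Counter


def top_endings(words, n, limit=5):
    """Return (rank, frequency, n-gram) rows for the most frequent n-character endings.

    Assumes n >= 1. Words shorter than n contribute the whole word,
    mirroring RIGHT(word, n) in the SQL query.
    """
    counts = Counter(word[-n:] for word in words)
    # Sort by descending frequency; alphabetical tie-break is an assumption
    # made here only to keep the output deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, freq, "-" + ending)
        for rank, (ending, freq) in enumerate(ranked[:limit], start=1)
    ]


# Hypothetical sample input; a real run would read the corpus word list.
sample_words = ["کتابهای", "درختهای", "روزهای", "ایران"]

for rank, freq, ngram in top_endings(sample_words, n=3):
    print(rank, freq, ngram, sep=" | ")
```

Changing `n` from 1 to 5 yields one table per N-gram length, in the same rank, frequency, n-gram layout as above.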